Building on GPT-4o, released last year, OpenAI has significantly updated its advanced voice mode, making voice interactions more natural and closer to the flow of human conversation. The feature is built on the natively multimodal model, which can respond to audio input in as little as 232 milliseconds, with an average of 320 milliseconds, nearly matching the pace of human conversation. Earlier this year, OpenAI had already made smaller updates to this voice mode, reducing how often it interrupts users and improving its handling of accents.